I. Introduction

Table 1. Sample of 5 randomly chosen countries from the data set used in this study

| Country | cumulative_confirmed_cases_per_10000 | Stringency_Index | Economic_Support_Index | Economic_Support_Index_levels |
|---|---|---|---|---|
| Poland | 60.129772 | 50.93 | 37.5 | [37.5,50) |
| Bangladesh | 24.312931 | 62.96 | 50.0 | [50,62.5) |
| Togo | 2.674959 | 53.70 | 50.0 | [50,62.5) |
| Dominican Republic | 114.417991 | 78.70 | 25.0 | [25,37.5) |
| Mozambique | 3.868796 | 62.04 | 0.0 | [0,12.5) |

| Country | Population2019 | age15_64_population_prop_2019 | nurses_midwives_per_1000_2018 | Smoking_prevalence_15_2016 |
|---|---|---|---|---|
| Poland | 37970874 | 66.69772 | 6.8926 | 28.0 |
| Bangladesh | 163046161 | 67.60528 | 0.4124 | 23.0 |
| Togo | 8082366 | 56.11430 | 0.4102 | 7.4 |
| Dominican Republic | 10738958 | 64.99292 | 1.3802 | 13.7 |
| Mozambique | 30366036 | 52.75355 | 0.6847 | 16.6 |

II. Exploratory data analysis


Table 2: Summary for the cumulative confirmed cases per 10,000

| n | min | median | mean | max | sd |
|---|---|---|---|---|---|
| 78 | 0.0334753 | 27.50465 | 67.20095 | 461.5392 | 92.11358 |

Figure 1. Distribution for the cumulative confirmed cases per 10,000 for individual countries

Figure 2. Distribution for the government response measured by the Stringency Index

Figure 3. Distribution for the government response measured by the Economic Support Index

Figure 4. Distribution for the proportion of the population that was 15-64 years old in 2019, for individual countries

Figure 5. Distribution for nurses and midwives per 1000 in 2018 for individual countries

Figure 6. Distribution for the smoking prevalence among 15+ year olds in 2016, for individual countries

Figure 7.1. Interactive scatterplot of the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Stringency Index (SI). The red line is the line of best fit and the blue curve is the LOESS curve. The vertical black lines mark the chosen knot locations at the 5th percentile (SI = 36.1100), 35th percentile (SI = 61.0640), 65th percentile (SI = 77.3335), and 95th percentile (SI = 92.7295).
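
The knot locations quoted in the caption are sample percentiles of the predictor. A minimal sketch of how such knots could be computed in R is given below. The simulated `stringency_index` vector is a hypothetical stand-in for the real column, so the resulting values will not match those in the caption.

```r
# Hypothetical stand-in for the Stringency Index column (78 countries);
# the real data set is not reproduced here.
set.seed(1)
stringency_index <- runif(78, min = 20, max = 100)

# Knots placed at the 5th, 35th, 65th and 95th percentiles of the predictor
knot_probs <- c(0.05, 0.35, 0.65, 0.95)
si_knots <- quantile(stringency_index, probs = knot_probs, names = FALSE)

si_knots
```

The same call, applied to the other predictors, would give the knots used for the age-proportion and nurses-and-midwives splines.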

Figure 7.2. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Stringency Index, grouped by the Economic Support Index levels

Figure 8. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Economic Support Index. The red line is the best fit line. The blue curve is the Loess curve.

Figure 9. Interactive scatterplot of the cumulative confirmed cases per 10,000 for individual countries against the proportion of their population that was 15-64 years old in 2019 (APP). The red line is the line of best fit and the blue curve is the LOESS curve. The vertical black lines mark the chosen knot locations at the 5th percentile (APP = 53.54730), 35th percentile (APP = 62.88236), 65th percentile (APP = 65.88236), and 95th percentile (APP = 72.63432).

We could not find an article establishing that this variable is relevant, although we suspect it may be, so no spline is used here.

Figure 10. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their Smoking prevalence for 15+ year olds in 2016. The red line is the best fit line. The blue curve is the Loess curve.

Figure 11. Interactive scatterplot of the cumulative confirmed cases per 10,000 for individual countries against their nurses and midwives per 1,000 in 2018 (NM). The red line is the line of best fit and the blue curve is the LOESS curve. The vertical black lines mark the chosen knot locations at the 5th percentile (NM = 0.434670), 35th percentile (NM = 1.654765), 65th percentile (NM = 4.196680), and 95th percentile (NM = 12.579690).

Figure 12. Boxplot of the relationship between the cumulative confirmed cases per 10,000 for individual countries and the Economic Support Index levels


III. Multiple linear regression

i. Methods


Figure 13. Distribution for the cumulative confirmed cases per 10,000 raised to 0.5, for individual countries

Using the following model:

## lm(formula = cumulative_confirmed_cases_per_10000_transf ~ ns(Stringency_Index, 
##     knots = c(36.11, 61.064, 77.3335, 92.7295)) + Economic_Support_Index + 
##     ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 
##         65.88236, 72.63432)) + ns(nurses_midwives_per_1000_2018, 
##     knots = c(0.43467, 1.654765, 4.19668, 12.57969)) + Smoking_prevalence_15_2016, 
##     data = tidy_joined_dataset)
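
For readers who want to experiment with this kind of specification, the sketch below fits a reduced version of the model on simulated data. The data frame `sim`, its values, and the reduced set of predictors are assumptions made for illustration. Only the use of `ns()` with four interior knots and the 0.5 power transformation mirror the call above.

```r
library(splines)  # provides ns(), the natural cubic spline basis

set.seed(42)
n <- 78

# Simulated predictors (hypothetical values, not the study data)
sim <- data.frame(
  Stringency_Index           = runif(n, 20, 100),
  Economic_Support_Index     = sample(c(0, 25, 50, 75, 100), n, replace = TRUE),
  Smoking_prevalence_15_2016 = runif(n, 5, 40)
)

# Simulated right-skewed outcome, then the 0.5 power transformation
sim$cases_per_10000 <- rgamma(n, shape = 0.6, scale = 100)
sim$cases_transf    <- sim$cases_per_10000^0.5

# Four interior knots at the 5th, 35th, 65th and 95th percentiles
si_knots <- quantile(sim$Stringency_Index, c(0.05, 0.35, 0.65, 0.95), names = FALSE)

# Spline term for the Stringency Index, linear terms for the other predictors
fit <- lm(
  cases_transf ~ ns(Stringency_Index, knots = si_knots) +
    Economic_Support_Index + Smoking_prevalence_15_2016,
  data = sim
)

length(coef(fit))  # intercept + 5 spline coefficients + 2 linear coefficients
```

With four interior knots and no intercept column, `ns()` produces five basis columns, which is why each spline term contributes five coefficients and 5 Df in the tables of this section.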

Figure 14. Normal Q-Q plot for the cumulative number of confirmed cases per 10,000, raised to 0.5

Figure 15. Residuals distribution for the statistical model

Figure 16. Residuals graph for the fitted values, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 17. Residuals graph for the Stringency Index, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 18. Residuals graph for the Economic Support Index, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 19. Residuals graph for the Proportion of population that is 15-64 years old, in 2019, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 20. Residuals graph for the Smoking prevalence for 15+ year olds in 2016, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 21. Residuals graph for the nurses and midwives per 1000 in 2018, with a Lowess curve in blue and a horizontal line at zero in red.

Table 3: VIF table

| Term | GVIF | Df | GVIF^(1/(2*Df)) |
|---|---|---|---|
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295)) | 3.249592 | 5 | 1.125079 |
| Economic_Support_Index | 1.349972 | 1 | 1.161883 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432)) | 3.108132 | 5 | 1.120082 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969)) | 5.211657 | 5 | 1.179499 |
| Smoking_prevalence_15_2016 | 1.440917 | 1 | 1.200382 |
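
The last column of Table 3 rescales each GVIF so that terms with different degrees of freedom can be compared. As a check on how it is obtained, two rows can be recomputed from the GVIF and Df values in the table:

\[\begin{aligned} \text{Stringency Index spline } (Df = 5):\quad & 3.249592^{1/(2 \cdot 5)} \approx 1.1251 \\ \text{Economic Support Index } (Df = 1):\quad & 1.349972^{1/2} \approx 1.1619 \end{aligned}\]

For a term with Df = 1 the rescaled value is simply the square root of the usual VIF.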

ii. Model Results and Interpretation


Table 4. Model Summary Table

## lm(formula = cumulative_confirmed_cases_per_10000_transf ~ ns(Stringency_Index, 
##     knots = c(36.11, 61.064, 77.3335, 92.7295)) + Economic_Support_Index + 
##     ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 
##         65.88236, 72.63432)) + ns(nurses_midwives_per_1000_2018, 
##     knots = c(0.43467, 1.654765, 4.19668, 12.57969)) + Smoking_prevalence_15_2016, 
##     data = tidy_joined_dataset)

| Term | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | -10.4612 | 4.7789 | -2.1890 | 0.0325 |
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295))1 | 10.1282 | 3.6028 | 2.8112 | 0.0067 |
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295))2 | 9.3930 | 4.0014 | 2.3474 | 0.0222 |
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295))3 | 17.0716 | 3.5971 | 4.7459 | 0.0000 |
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295))4 | 16.2272 | 7.7118 | 2.1042 | 0.0396 |
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295))5 | 0.7466 | 3.0984 | 0.2410 | 0.8104 |
| Economic_Support_Index | 0.0257 | 0.0175 | 1.4694 | 0.1470 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432))1 | 1.1669 | 3.1130 | 0.3748 | 0.7091 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432))2 | 6.4597 | 3.4426 | 1.8764 | 0.0655 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432))3 | 2.8432 | 3.7089 | 0.7666 | 0.4463 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432))4 | 11.7261 | 6.7166 | 1.7458 | 0.0860 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432))5 | 12.4694 | 3.2654 | 3.8186 | 0.0003 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969))1 | 1.7848 | 2.2335 | 0.7991 | 0.4274 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969))2 | -0.0042 | 3.4184 | -0.0012 | 0.9990 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969))3 | 9.8196 | 3.7608 | 2.6110 | 0.0114 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969))4 | 8.7379 | 4.2247 | 2.0683 | 0.0429 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969))5 | 8.2892 | 2.9914 | 2.7710 | 0.0074 |
| Smoking_prevalence_15_2016 | -0.0427 | 0.0498 | -0.8574 | 0.3946 |

| Statistic | Value | df |
|---|---|---|
| Residual Standard Error | 3.467 | 60 |
| Multiple R-squared | 0.643 | |
| Adjusted R-squared | 0.542 | |

| Statistic | Value | Numerator df | Denominator df |
|---|---|---|---|
| Model F-statistic | 6.363 | 17 | 60 |
| P-value | 3.565e-08 | | |
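
The fit statistics in Table 4 are linked by standard identities, which gives a quick consistency check. With \(n = 78\) countries and \(p = 17\) estimated slopes (so \(n - p - 1 = 60\) residual degrees of freedom), and using the rounded \(R^2 = 0.643\):

\[\begin{aligned} R^2_{adj} &= 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} = 1 - 0.357 \times \frac{77}{60} \approx 0.542 \\ F &= \frac{R^2 / p}{(1 - R^2)/(n - p - 1)} = \frac{0.643 / 17}{0.357 / 60} \approx 6.36 \end{aligned}\]

Both agree with the reported values (0.542 and 6.363) up to the rounding of \(R^2\).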

\[\begin{aligned} H_0:&\ \beta_{SI,\,0-25} = 0 \\ \text{vs } H_A:&\ \beta_{SI,\,0-25} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{SI,\,25-50} = 0 \\ \text{vs } H_A:&\ \beta_{SI,\,25-50} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{SI,\,50-75} = 0 \\ \text{vs } H_A:&\ \beta_{SI,\,50-75} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{SI,\,75-100} = 0 \\ \text{vs } H_A:&\ \beta_{SI,\,75-100} \neq 0 \end{aligned}\]

\[\begin{aligned} H_0:&\ \beta_{\text{15to65 APP},\,50-67.5} = 0 \\ \text{vs } H_A:&\ \beta_{\text{15to65 APP},\,50-67.5} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{\text{15to65 APP},\,67.5-80} = 0 \\ \text{vs } H_A:&\ \beta_{\text{15to65 APP},\,67.5-80} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{NM,\,0-10} = 0 \\ \text{vs } H_A:&\ \beta_{NM,\,0-10} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{NM,\,10-20} = 0 \\ \text{vs } H_A:&\ \beta_{NM,\,10-20} \neq 0 \end{aligned}\]

\[\begin{aligned} H_0:&\ \beta_{ESI} = 0 \\ \text{vs } H_A:&\ \beta_{ESI} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\ \beta_{SP} = 0 \\ \text{vs } H_A:&\ \beta_{SP} \neq 0 \end{aligned}\]
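
Each pair of hypotheses above is assessed with a t-test on the corresponding coefficient. As a sketch, the test for the Economic Support Index can be reproduced from the rounded estimate and standard error in Table 4 and the 60 residual degrees of freedom. Because the inputs are rounded, the result only approximately matches the reported t value of 1.4694 and p-value of 0.1470.

```r
# Rounded values taken from Table 4 (Economic_Support_Index row)
estimate  <- 0.0257
std_error <- 0.0175
df_resid  <- 60  # residual degrees of freedom of the fitted model

# t statistic for H0: beta_ESI = 0
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with 60 degrees of freedom
p_value <- 2 * pt(-abs(t_value), df = df_resid)

c(t_value = t_value, p_value = p_value)
```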

iii. Inference for multiple regression

Table 5. ANOVA Table

| Term | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| ns(Stringency_Index, knots = c(36.11, 61.064, 77.3335, 92.7295)) | 5 | 390.6686 | 78.1337 | 6.4999 | 0.0001 |
| Economic_Support_Index | 1 | 126.2974 | 126.2974 | 10.5065 | 0.0019 |
| ns(age15_64_population_prop_2019, knots = c(53.5473, 62.88236, 65.88236, 72.63432)) | 5 | 531.3317 | 106.2663 | 8.8402 | 0.0000 |
| ns(nurses_midwives_per_1000_2018, knots = c(0.43467, 1.654765, 4.19668, 12.57969)) | 5 | 243.1708 | 48.6342 | 4.0458 | 0.0031 |
| Smoking_prevalence_15_2016 | 1 | 8.8368 | 8.8368 | 0.7351 | 0.3946 |
| Residuals | 60 | 721.2501 | 12.0208 | | |
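
Each F value in Table 5 is the term's mean square divided by the residual mean square, and the residual mean square is the square of the residual standard error in Table 4. Using the values in the tables:

\[\begin{aligned} F_{SI} &= \frac{78.1337}{12.0208} \approx 6.50 \\ \sqrt{12.0208} &\approx 3.467 \\ F_{SP} &= \frac{8.8368}{12.0208} \approx 0.735 \approx (-0.8574)^2 \end{aligned}\]

The last line shows that the F value for smoking prevalence equals the square of its t value in Table 4, which is what one would expect if the table was produced by R's `anova()` on the fitted `lm` object (an assumption on our part). In that case the sums of squares are sequential: each term is tested after the terms listed above it, so only the last term's test coincides with its t-test. This would also explain why the Economic Support Index has p = 0.0019 here but p = 0.1470 in Table 4.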

Figure 22. Interactive scatterplot of the cumulative confirmed cases per 10,000 (raised to 0.5) for individual countries against their government response measured by the Stringency Index (SI), with the other predictors held at their medians: Economic Support Index = 50, proportion of the population aged 15-64 in 2019 = 64.65951, nurses and midwives per 1,000 in 2018 = 2.47285, and smoking prevalence for ages 15+ in 2016 = 17.05. The blue line is the fitted spline, shown with its 95% confidence interval (CI) and the wider 95% prediction interval (PI) in pink. The vertical black lines mark the chosen knot locations at the 5th percentile (SI = 36.1100), 35th percentile (SI = 61.0640), 65th percentile (SI = 77.3335), and 95th percentile (SI = 92.7295).

Figure 23. Interactive scatterplot of the cumulative confirmed cases per 10,000 (raised to 0.5) for individual countries against their nurses and midwives per 1,000 in 2018 (NM), with the other predictors held at their medians: Stringency Index = 70.14, Economic Support Index = 50, proportion of the population aged 15-64 in 2019 = 64.65951, and smoking prevalence for ages 15+ in 2016 = 17.05. The blue line is the fitted spline, shown with its 95% confidence interval (CI) and the wider 95% prediction interval (PI) in pink. The vertical black lines mark the chosen knot locations at the 5th percentile (NM = 0.434670), 35th percentile (NM = 1.654765), 65th percentile (NM = 4.196680), and 95th percentile (NM = 12.579690).

Figure 24. Interactive scatterplot of the cumulative confirmed cases per 10,000 (raised to 0.5) for individual countries against the proportion of their population that was 15-64 years old in 2019 (APP), with the other predictors held at their medians: Stringency Index = 70.14, Economic Support Index = 50, nurses and midwives per 1,000 in 2018 = 2.47285, and smoking prevalence for ages 15+ in 2016 = 17.05. The blue line is the fitted spline, shown with its 95% confidence interval (CI) and the wider 95% prediction interval (PI) in pink. The vertical black lines mark the chosen knot locations at the 5th percentile (APP = 53.54730), 35th percentile (APP = 62.88236), 65th percentile (APP = 65.88236), and 95th percentile (APP = 72.63432).

Figure 25. Interactive scatterplot of the cumulative confirmed cases per 10,000 (raised to 0.5) for individual countries against their smoking prevalence for 15+ year olds in 2016, with the other predictors held at their medians: Stringency Index = 70.14, Economic Support Index = 50, proportion of the population aged 15-64 in 2019 = 64.65951, and nurses and midwives per 1,000 in 2018 = 2.47285. The blue line is the fitted line (smoking prevalence enters the model as a linear term), shown with its 95% confidence interval (CI) and the wider 95% prediction interval (PI) in pink.

Figure 26. Interactive scatterplot of the cumulative confirmed cases per 10,000 (raised to 0.5) for individual countries against their government response measured by the Economic Support Index, with the other predictors held at their medians: Stringency Index = 70.14, proportion of the population aged 15-64 in 2019 = 64.65951, nurses and midwives per 1,000 in 2018 = 2.47285, and smoking prevalence for ages 15+ in 2016 = 17.05. The blue line is the fitted line (the Economic Support Index enters the model as a linear term), shown with its 95% confidence interval (CI) and the wider 95% prediction interval (PI) in pink.

If the interval limits were transformed back to the original scale, the intervals would no longer be symmetric about the point estimate.

Table 6. The 95% prediction intervals for the \(\text{cumulative confirmed cases per 10,000}^{0.5}\) at Stringency Index (SI) = 30, 50, 70.14, and 90, with the other predictors held at their medians: Economic Support Index = 50, proportion of the population aged 15-64 in 2019 = 64.65951, nurses and midwives per 1,000 in 2018 = 2.47285, and smoking prevalence for ages 15+ in 2016 = 17.05.

| SI | Point Estimate | Lower Limit | Upper Limit |
|---|---|---|---|
| 30.00 | -0.52479 | -8.48049 | 7.43092 |
| 50.00 | 4.94393 | -2.50340 | 12.39126 |
| 70.14 | 5.74856 | -1.50817 | 13.00530 |
| 90.00 | 10.64518 | 3.13071 | 18.15964 |
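
To illustrate the asymmetry on the original scale, the SI = 90 row of Table 6 can be squared, which inverts the 0.5 power transformation. This is only a sketch: squaring preserves order only for non-negative values, so rows with a negative lower limit would first need that limit truncated at 0, which is an assumption rather than something done in the analysis.

\[\begin{aligned} \text{point estimate: } & 10.64518^2 \approx 113.32 \\ \text{lower limit: } & 3.13071^2 \approx 9.80 \\ \text{upper limit: } & 18.15964^2 \approx 329.77 \end{aligned}\]

On the original scale the interval extends about 103.5 cases per 10,000 below the point estimate but about 216.5 above it, whereas on the transformed scale both distances are about 7.51.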

IV. Discussion

i. Conclusions

ii. Limitations

iii. Further questions


V. Citations and References